Abstract

In this study, a two-state Markov switching count-data model is proposed as an alternative to zero-inflated models to account for the preponderance of zeros sometimes observed in transportation count data, such as the number of accidents occurring on a roadway segment over some period of time. For this accident-frequency case, zero-inflated models assume the existence of two states: one of the states is a zero-accident count state, in which accident probabilities are so low that they cannot be statistically distinguished from zero, and the other state is a normal count state, in which counts can be non-negative integers that are generated by some counting process, for example, a Poisson or negative binomial. In contrast to zero-inflated models, Markov switching models allow specific roadway segments to switch between the two states over time. An important advantage of this Markov switching approach is that it allows for the direct statistical estimation of the specific roadway-segment state (i.e., zero or count state) whereas traditional zero-inflated models do not. To demonstrate the applicability of this approach, a two-state Markov switching negative binomial model (estimated with Bayesian inference) and standard zero-inflated negative binomial models are estimated using five-year accident frequencies on Indiana interstate highway segments. It is shown that the Markov switching model is a viable alternative and results in a superior statistical fit relative to the zero-inflated models.
1 Introduction



The preponderance of zeros observed in many count-data applications has led researchers to consider the possibility that two states exist: one state that is a "zero" state (where all counts are zero) and the other that is a normal count state that includes zeros and positive integers. This two-state assumption has led to the development of zero-inflated Poisson models and, to account for possible over-dispersion in the normal-count state, zero-inflated negative binomial models. These zero-inflated models have been applied in a number of fields of study. For example, Lambert (1992) used a zero-inflated Poisson model to study manufacturing defects. Lambert argued that unobserved changes in the process caused manufacturing defects to move randomly between a state that was near perfect (the zero state where defects were extremely rare) and an imperfect state where defects were possible but not inevitable (the normal count state). Lambert's empirical assessment demonstrated that the zero-inflated modeling approach fit the data much better than the standard Poisson. In other work, van den Broek (1995) provided an application of the zero-inflated Poisson to the frequency of urinary tract infections in men diagnosed with the human immunodeficiency virus (HIV). In this case, it was postulated that a zero-infection state existed for a portion of the patient population and that this state generated a large number of zeros in the frequency data, which was supported by the statistical findings. Also, Böhning et al. (1999) successfully applied the zero-inflated Poisson to study the frequency of dental decay in Portugal.



The frequency of vehicle accidents on a section of highway or at an intersection (over some time period) often exhibits excess zeros. Similar to the literature discussed above, the excess of zeros observed in the data could potentially be explained by the existence of a two-state process for accident data generation (Shankar et al., 1997; Carson and Mannering, 2001; Lee and Mannering, 2002). In this case, roadway segments can belong to one of two states: a zero-accident state (where zero accidents are expected) and a normal-count state, in which accidents can happen and accident frequencies are generated by some given counting process (Poisson or negative binomial). To account for the two-state phenomenon, zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models have been used in a number of roadway safety studies (Miaou, 1994; Shankar et al., 1997; Washington et al., 2003). These models explicitly account for the existence of the two states for accident data generation and allow modeling of the probabilities of being in these states.



The application of ZIP and ZINB models was an empirical advance in statistical modeling of accident frequencies. However, although zero-inflated models have become popular in a number of fields, they suffer from two important drawbacks. First, these models do not deal directly with the states of roadway segments; instead, they consider probabilities of being in these states. As a result, zero-inflated models do not allow a direct statistical estimation of whether individual roadway segments are in the zero or normal count state. For example, suppose a given roadway segment has zero accidents observed over a given time interval. This segment could truly be in the zero-accident count state, or it may be in the normal-count state and just happened to have zero accidents over the considered time interval (Shankar et al., 1997). Distinguishing between these two possibilities is not straightforward in zero-inflated models. The second drawback of zero-inflated models is that, although they allow roadway segments to be in different states during different observation periods, zero-inflated models do not explicitly consider switching by the roadway segments between the states over time. This switching is important from the theoretical point of view because it is unreasonable to expect any roadway segment to be in the zero-accident state all the time and to have the long-term mean accident frequency equal to zero (Lord et al., 2005).



In this study, we propose two-state Markov switching count-data models that 
consider the zero-accident state and the normal-count state of roadway safety. 
Similar to zero-inflated models, Markov switching models are intended to ex- 
plain the preponderance of zeros observed in accident count data. However, 
in contrast to zero-inflated models, Markov switching models allow a direct 
statistical estimation of the states roadway segments are in at specific points 
in time and explicitly consider changes in these states over time. 



2 Model specification 



Two-state Markov switching count-data models of accident frequencies were first presented in Malyshkina et al. (2009). Although there are several major differences between Malyshkina et al. (2009) and this study, many ideas and statistical estimation methods developed in that paper apply in this study as well. In that paper, two states were assumed to exist but both were true count states (i.e., a zero-count state did not exist), and all roadway segments were assumed to be in the same state at the same time. In the current paper, we take a different approach and consider the case where one of the states is a zero state and the other is a true count state, and where individual roadway segments move between these two states over time.



To show this model, we note that Markov switching models are parametric and can be fully specified by a likelihood function $f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})$, which is the conditional probability distribution of the vector of all observations $\mathbf{Y}$, given the vector $\boldsymbol{\Theta}$ of all parameters of model $\mathcal{M}$. In our study, we observe the number of accidents $A_{t,n}$ that occur on the $n$-th roadway segment during time period $t$. Thus $\mathbf{Y} = \{A_{t,n}\}$ includes all accidents observed on all roadway segments over all time periods. Here $n = 1, 2, \ldots, N$ and $t = 1, 2, \ldots, T$, where $N$ is the total number of roadway segments observed (it is assumed to be constant over time) and $T$ is the total number of time periods. Model $\mathcal{M} = \{M, \mathbf{X}_{t,n}\}$ includes the model's name $M$ (for example, $M$ = "ZIP" or "ZINB") and the vector $\mathbf{X}_{t,n}$ of all roadway segment characteristic variables (segment length, curve characteristics, grades, pavement properties, and so on).

To define the likelihood function, we introduce an unobserved (latent) state variable $s_{t,n}$, which determines the state of the $n$-th roadway segment during time period $t$. Without loss of generality, it is assumed that the state variable $s_{t,n}$ can take on the following two values: $s_{t,n} = 0$ corresponds to the zero-accident state, and $s_{t,n} = 1$ corresponds to the normal-count state ($n = 1, 2, \ldots, N$ and $t = 1, 2, \ldots, T$). It is further assumed that, for each roadway segment $n$, the state variable $s_{t,n}$ follows a stationary two-state Markov chain process in time,^1 which can be specified by time-independent transition probabilities as



$$P(s_{t+1,n} = 1 \,|\, s_{t,n} = 0) = p^{(n)}_{0\to1}, \qquad P(s_{t+1,n} = 0 \,|\, s_{t,n} = 1) = p^{(n)}_{1\to0}. \qquad (1)$$

Here, for example, $P(s_{t+1,n} = 1 \,|\, s_{t,n} = 0)$ is the conditional probability of $s_{t+1,n} = 1$ at time $t+1$, given that $s_{t,n} = 0$ at time $t$. Transition probabilities $p^{(n)}_{0\to1}$ and $p^{(n)}_{1\to0}$ are unknown parameters to be estimated from accident data ($n = 1, 2, \ldots, N$). The stationary unconditional probabilities of states $s_{t,n} = 0$ and $s_{t,n} = 1$ are $\bar{p}^{(n)}_0 = p^{(n)}_{1\to0}/(p^{(n)}_{0\to1} + p^{(n)}_{1\to0})$ and $\bar{p}^{(n)}_1 = p^{(n)}_{0\to1}/(p^{(n)}_{0\to1} + p^{(n)}_{1\to0})$, respectively.^2 If $p^{(n)}_{0\to1} < p^{(n)}_{1\to0}$, then $\bar{p}^{(n)}_0 > \bar{p}^{(n)}_1$ and, on average, for roadway segment $n$ state $s_{t,n} = 0$ occurs more frequently than state $s_{t,n} = 1$. If $p^{(n)}_{0\to1} > p^{(n)}_{1\to0}$, then state $s_{t,n} = 1$ occurs more frequently for segment $n$.^3
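As an illustration of the two-state chain just described, the following is a minimal Python sketch, assuming a single roadway segment with transition probabilities passed as `p01` (state 0 to 1) and `p10` (state 1 to 0). The function names and the choice to draw the initial state from the stationary distribution are assumptions made for the example, not part of the original model description.

```python
import random

def stationary_probs(p01, p10):
    """Stationary probabilities (p0_bar, p1_bar) of a two-state Markov chain
    with P(0 -> 1) = p01 and P(1 -> 0) = p10."""
    total = p01 + p10
    if total <= 0:
        raise ValueError("at least one transition probability must be positive")
    return p10 / total, p01 / total

def simulate_states(p01, p10, T, rng):
    """Simulate s_1, ..., s_T for one segment (hypothetical helper).
    The first state is drawn from the stationary distribution."""
    p0_bar, _ = stationary_probs(p01, p10)
    s = 0 if rng.random() < p0_bar else 1
    path = [s]
    for _ in range(T - 1):
        if s == 0:
            s = 1 if rng.random() < p01 else 0   # leave the zero state
        else:
            s = 0 if rng.random() < p10 else 1   # leave the count state
        path.append(s)
    return path

if __name__ == "__main__":
    print(stationary_probs(0.2, 0.6))
    print(simulate_states(0.2, 0.6, 5, random.Random(1)))
```

With `p01 = 0.2` and `p10 = 0.6`, the zero-accident state is the more frequent one, in line with the inequality discussed in the text.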

Next, consider a two-state Markov switching negative binomial (MSNB) model that assumes a negative binomial (NB) data-generating process in the normal-count state $s_{t,n} = 1$. With this, the probability of $A_{t,n}$ accidents occurring on roadway segment $n$ during time period $t$ is



^1 Markov property means that the probability distribution of $s_{t+1,n}$ depends only on the value $s_{t,n}$ at time $t$, but not on the previous history $s_{t-1,n}, s_{t-2,n}, \ldots$ Stationarity of $\{s_{t,n}\}$ is in the statistical sense.

^2 These can be found from stationarity conditions $\bar{p}^{(n)}_0 = [1 - p^{(n)}_{0\to1}]\,\bar{p}^{(n)}_0 + p^{(n)}_{1\to0}\,\bar{p}^{(n)}_1$, $\bar{p}^{(n)}_1 = p^{(n)}_{0\to1}\,\bar{p}^{(n)}_0 + [1 - p^{(n)}_{1\to0}]\,\bar{p}^{(n)}_1$ and $\bar{p}^{(n)}_0 + \bar{p}^{(n)}_1 = 1$.

^3 Here, Eq. (1) is a significant departure from Malyshkina et al. (2009) in that individual roadway segments can be in different states at the same time (i.e., the state variable is subscripted by roadway segment $n$). Also, in contrast to Malyshkina et al. (2009), here we do not restrict state $s_{t,n} = 0$ to be more frequent than state $s_{t,n} = 1$.
Fig. 1. Graphical demonstration of a two-state Markov switching model.
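
$$P^{(A)}_{t,n} = \begin{cases} \mathcal{I}(A_{t,n}) & \text{if } s_{t,n} = 0, \\ \mathcal{NB}(A_{t,n}) & \text{if } s_{t,n} = 1, \end{cases} \qquad (2)$$

$$\mathcal{I}(A_{t,n}) = \{\,1 \text{ if } A_{t,n} = 0 \text{ and } 0 \text{ if } A_{t,n} > 0\,\}, \qquad (3)$$

$$\mathcal{NB}(A_{t,n}) = \frac{\Gamma(A_{t,n} + 1/\alpha)}{\Gamma(1/\alpha)\,A_{t,n}!} \left(\frac{1}{1 + \alpha\lambda_{t,n}}\right)^{1/\alpha} \left(\frac{\alpha\lambda_{t,n}}{1 + \alpha\lambda_{t,n}}\right)^{A_{t,n}}, \qquad (4)$$

$$\lambda_{t,n} = \exp(\boldsymbol{\beta}'\mathbf{X}_{t,n}), \qquad t = 1, 2, \ldots, T, \quad n = 1, 2, \ldots, N. \qquad (5)$$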


Here, Eq. (3) is the probability mass function that reflects the fact that accidents never happen in the zero-accident state $s_{t,n} = 0$.^4 Eq. (4) is the standard negative binomial probability mass function, $\Gamma(\cdot)$ is the gamma function, and prime means transpose (so $\boldsymbol{\beta}'$ is the transpose of $\boldsymbol{\beta}$). Parameter vector $\boldsymbol{\beta}$ and the over-dispersion parameter $\alpha > 0$ are unknown estimable model parameters.^5 Scalars $\lambda_{t,n}$ are the accident rates in the normal-count state. We set the first component of $\mathbf{X}_{t,n}$ to unity, and, therefore, the first component of $\boldsymbol{\beta}$ is the intercept.
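The state-dependent accident probabilities can be sketched in Python as follows. The sketch assumes the standard negative binomial parameterization with mean $\lambda$ and variance $\lambda(1 + \alpha\lambda)$; the function names (`nb_pmf`, `zero_pmf`, `accident_rate`, `state_pmf`) are hypothetical and chosen only for this illustration.

```python
import math

def nb_pmf(a, lam, alpha):
    """Negative binomial pmf with mean lam and variance lam * (1 + alpha * lam).
    Computed in log space via lgamma for numerical stability."""
    r = 1.0 / alpha
    log_p = (math.lgamma(a + r) - math.lgamma(r) - math.lgamma(a + 1)
             - r * math.log1p(alpha * lam)
             + a * (math.log(alpha * lam) - math.log1p(alpha * lam)))
    return math.exp(log_p)

def zero_pmf(a):
    """Zero-accident distribution: all probability mass sits on a = 0."""
    return 1.0 if a == 0 else 0.0

def accident_rate(beta, x):
    """Accident rate exp(beta' x); x[0] = 1 makes beta[0] the intercept."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))

def state_pmf(a, s, lam, alpha):
    """Probability of a accidents given the latent state s (0 or 1)."""
    return zero_pmf(a) if s == 0 else nb_pmf(a, lam, alpha)

if __name__ == "__main__":
    lam = accident_rate([0.5, 0.1], [1.0, 2.0])   # made-up coefficients
    print(state_pmf(0, 0, lam, 0.2), state_pmf(0, 1, lam, 0.2))
```

Working with log-gamma values rather than factorials keeps the computation stable for large accident counts.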

A two-state Markov switching model of accident frequencies is graphically demonstrated in Figure 1. In the two states $s = 0$ and $s = 1$ shown in the figure, the accident frequency data are generated by two different processes, shown by the circles (for state $s = 0$) and the diamonds (for $s = 1$). In this study, we assume that accident frequency is generated according to the zero-accident distribution $\mathcal{I}(A_{t,n})$ in state $s = 0$, and according to the standard negative binomial distribution $\mathcal{NB}(A_{t,n})$ in state $s = 1$ (these two distributions are outlined by the boxes in Figure 1). The state variable $s_{t,n}$ follows a Markov process over time, with transition probabilities $p^{(n)}_{0\to0}$, $p^{(n)}_{0\to1}$, $p^{(n)}_{1\to0}$ and $p^{(n)}_{1\to1}$, as shown in Figure 1.

^4 Although Eq. (3) formally assumes $s_{t,n} = 0$ to be a zero-accident state, in which accidents never happen, this state can be viewed as an approximation for a nearly safe state, in which the average accident rate is negligible ($\lambda_{t,n} \ll 1$) and accidents are extremely rare (over the considered time period).

^5 To ensure that $\alpha$ is non-negative, we estimate its logarithm instead of it.

If accident events are assumed to be independent, the likelihood function is

$$f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M}) = \prod_{t=1}^{T} \prod_{n=1}^{N} P^{(A)}_{t,n}. \qquad (6)$$

Here, because the state variables $s_{t,n}$ are unobservable, the vector of all estimable parameters $\boldsymbol{\Theta}$ must include all states, in addition to all model parameters ($\beta$-s, $\alpha$) and transition probabilities. Thus, $\boldsymbol{\Theta} = [\boldsymbol{\beta}', \alpha, p^{(1)}_{0\to1}, \ldots, p^{(N)}_{0\to1}, p^{(1)}_{1\to0}, \ldots, p^{(N)}_{1\to0}, \mathbf{S}']'$, where vector $\mathbf{S} = [(s_{1,1}, \ldots, s_{T,1}), \ldots, (s_{1,N}, \ldots, s_{T,N})]'$ has length $T \times N$ and contains all state values.

Eqs. (1)-(6) define the two-state Markov switching negative binomial (MSNB) model considered here. Note that in this model the estimable state variables $s_{t,n}$ explicitly specify the states of all roadway segments $n = 1, 2, \ldots, N$ during all time periods $t = 1, 2, \ldots, T$.
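To make the role of the state variables in the likelihood concrete, the following Python sketch evaluates the logarithm of the product over time periods and segments for given states. It assumes the data, states and accident rates are supplied as nested lists of shape T x N and that the normal-count state uses the negative binomial form with mean $\lambda$ and over-dispersion $\alpha$; the names `log_nb` and `msnb_log_likelihood` are hypothetical.

```python
import math

def log_nb(a, lam, alpha):
    """Log of the negative binomial pmf with mean lam and over-dispersion alpha."""
    r = 1.0 / alpha
    return (math.lgamma(a + r) - math.lgamma(r) - math.lgamma(a + 1)
            - r * math.log1p(alpha * lam)
            + a * (math.log(alpha * lam) - math.log1p(alpha * lam)))

def msnb_log_likelihood(A, S, lam, alpha):
    """Log-likelihood of accident counts A[t][n] given states S[t][n]
    and accident rates lam[t][n] (all nested lists of shape T x N)."""
    total = 0.0
    for A_t, S_t, lam_t in zip(A, S, lam):
        for a, s, rate in zip(A_t, S_t, lam_t):
            if s == 0:
                if a > 0:
                    # The zero state cannot generate accidents.
                    return float("-inf")
                # log of probability 1 adds nothing to the total
            else:
                total += log_nb(a, rate, alpha)
    return total

if __name__ == "__main__":
    A = [[0, 2], [0, 0]]            # made-up counts, T = 2, N = 2
    S = [[0, 1], [0, 1]]            # segment 1 in zero state, segment 2 in count state
    lam = [[1.5, 1.5], [1.5, 1.5]]
    print(msnb_log_likelihood(A, S, lam, 0.2))
```

A state configuration that places a segment with observed accidents in the zero state receives zero likelihood, which is why the observed counts constrain the admissible values of the state vector.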

In this study, in addition to the MSNB model, we also consider the standard zero-inflated negative binomial (ZINB) models. In this case, the probability of $A_{t,n}$ accidents occurring is (Washington et al., 2003)

$$P^{(A)}_{t,n} = q_{t,n}\,\mathcal{I}(A_{t,n}) + (1 - q_{t,n})\,\mathcal{NB}(A_{t,n}), \qquad (7)$$

$$q_{t,n} = \frac{1}{1 + e^{-\tau \ln \lambda_{t,n}}}, \qquad (8)$$

$$q_{t,n} = \frac{1}{1 + e^{-\boldsymbol{\gamma}'\mathbf{X}_{t,n}}}, \qquad (9)$$

where we use two different specifications for the probability $q_{t,n}$ that the $n$-th roadway segment is in the zero-accident state during time period $t$. The right-hand side of Eq. (7) is a mixture of the zero-accident distribution $\mathcal{I}(A_{t,n})$ given by Eq. (3) and the negative binomial distribution $\mathcal{NB}(A_{t,n})$ given by Eq. (4). Scalar $\tau$ and vector $\boldsymbol{\gamma}$ are estimable model parameters. Accident rate $\lambda_{t,n}$ is given by Eq. (5). We call "ZINB-$\tau$" the model specified by Eqs. (7) and (8). We call "ZINB-$\gamma$" the model specified by Eqs. (7) and (9). Note that $q_{t,n}$ depends on the estimable model parameters and gives the probability of being in the zero-accident state $s_{t,n} = 0$, but it is not an estimable parameter by itself and does not explicitly specify the state value $s_{t,n}$.
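For comparison with the Markov switching formulation, the zero-inflated mixture can be sketched in Python as below. The sketch assumes logistic forms for the zero-state probability, $q = 1/(1 + e^{-\tau \ln \lambda})$ for ZINB-$\tau$ and $q = 1/(1 + e^{-\gamma' X})$ for ZINB-$\gamma$; all function names are hypothetical.

```python
import math

def logistic(z):
    """Numerically stable 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def q_tau(tau, lam):
    """Zero-state probability driven by the log accident rate (assumed form)."""
    return logistic(tau * math.log(lam))

def q_gamma(gamma, x):
    """Zero-state probability driven by a separate linear predictor (assumed form)."""
    return logistic(sum(g * xi for g, xi in zip(gamma, x)))

def nb_pmf(a, lam, alpha):
    """Negative binomial pmf with mean lam and over-dispersion alpha."""
    r = 1.0 / alpha
    return math.exp(math.lgamma(a + r) - math.lgamma(r) - math.lgamma(a + 1)
                    - r * math.log1p(alpha * lam)
                    + a * (math.log(alpha * lam) - math.log1p(alpha * lam)))

def zinb_pmf(a, q, lam, alpha):
    """Mixture of a point mass at zero (weight q) and a negative binomial."""
    point_mass = 1.0 if a == 0 else 0.0
    return q * point_mass + (1.0 - q) * nb_pmf(a, lam, alpha)

if __name__ == "__main__":
    q = q_tau(-1.5, 3.0)            # made-up parameter values
    print(q, zinb_pmf(0, q, 3.0, 0.15))
```

Unlike the Markov switching sketch, nothing here records which state a segment is actually in; only the mixing probability `q` enters the computation.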
3 Model estimation methods



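Statistical estimation of Markov switching models is complicated by unobservability of the state variables $s_{t,n}$.^6 As a result, the traditional maximum likelihood estimation (MLE) procedure is of very limited use for Markov switching models. Instead, a Bayesian inference approach is used. Given a model $\mathcal{M}$ with likelihood function $f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})$, the Bayes formula is

$$f(\boldsymbol{\Theta}|\mathbf{Y},\mathcal{M}) = \frac{f(\mathbf{Y},\boldsymbol{\Theta}|\mathcal{M})}{f(\mathbf{Y}|\mathcal{M})} = \frac{f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})\,\pi(\boldsymbol{\Theta}|\mathcal{M})}{\int f(\mathbf{Y},\boldsymbol{\Theta}|\mathcal{M})\,d\boldsymbol{\Theta}}. \qquad (10)$$

Here $f(\boldsymbol{\Theta}|\mathbf{Y},\mathcal{M})$ is the posterior probability distribution of model parameters $\boldsymbol{\Theta}$ conditional on the observed data $\mathbf{Y}$ and model $\mathcal{M}$. Function $f(\mathbf{Y},\boldsymbol{\Theta}|\mathcal{M})$ is the joint probability distribution of $\mathbf{Y}$ and $\boldsymbol{\Theta}$ given model $\mathcal{M}$. Function $f(\mathbf{Y}|\mathcal{M})$ is the marginal likelihood function, that is, the probability distribution of data $\mathbf{Y}$ given model $\mathcal{M}$. Function $\pi(\boldsymbol{\Theta}|\mathcal{M})$ is the prior probability distribution of parameters that reflects prior knowledge about $\boldsymbol{\Theta}$. The intuition behind Eq. (10) is straightforward: given model $\mathcal{M}$, the posterior distribution accounts for both the observations $\mathbf{Y}$ and our prior knowledge of $\boldsymbol{\Theta}$.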

In our study (and in most practical studies), the direct application of Eq. (10) is not feasible because the parameter vector $\boldsymbol{\Theta}$ contains too many components, making integration over $\boldsymbol{\Theta}$ in Eq. (10) extremely difficult. However, the posterior distribution $f(\boldsymbol{\Theta}|\mathbf{Y},\mathcal{M})$ in Eq. (10) is known up to its normalization constant, $f(\boldsymbol{\Theta}|\mathbf{Y},\mathcal{M}) \propto f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})\,\pi(\boldsymbol{\Theta}|\mathcal{M})$. As a result, we use Markov Chain Monte Carlo (MCMC) simulations, which provide a convenient and practical computational methodology for sampling from a probability distribution known up to a constant (the posterior distribution in our case). Given a large enough posterior sample of parameter vector $\boldsymbol{\Theta}$, any posterior expectation and variance can be found and Bayesian inference can be readily applied. A reader interested in details is referred to Malyshkina (2008), where we comprehensively describe our choice of the prior distribution $\pi(\boldsymbol{\Theta}|\mathcal{M})$ and the MCMC simulation algorithm.^7 We used the MATLAB language for programming and running the MCMC simulations.
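The idea of sampling from a distribution known only up to a constant can be illustrated with a generic random-walk Metropolis sampler. The Python sketch below is not the algorithm used in this study (which is described in Malyshkina, 2008, and was implemented in MATLAB); it is a textbook illustration with a hypothetical toy target, `log_std_normal`, standing in for an unnormalized log-posterior.

```python
import math
import random

def metropolis(log_post, theta0, step, n_samples, rng):
    """Random-walk Metropolis sampler for a density known up to a constant.
    log_post returns the unnormalized log-density at a parameter vector."""
    theta = list(theta0)
    lp = log_post(theta)
    samples = []
    for _ in range(n_samples):
        proposal = [x + rng.gauss(0.0, step) for x in theta]
        lp_prop = log_post(proposal)
        # Accept with probability min(1, ratio); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
        samples.append(list(theta))
    return samples

def log_std_normal(theta):
    """Toy target (assumption for the example): unnormalized standard normal."""
    return -0.5 * sum(x * x for x in theta)

if __name__ == "__main__":
    draws = metropolis(log_std_normal, [0.0], 2.0, 20000, random.Random(0))
    kept = [d[0] for d in draws[2000:]]          # discard a burn-in period
    print(sum(kept) / len(kept))                 # posterior mean estimate
```

Only differences of log-densities enter the acceptance step, so the normalization constant in the Bayes formula never has to be computed.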

For comparison of different models we use a formal Bayesian approach. Let there be two models $\mathcal{M}_1$ and $\mathcal{M}_2$ with parameter vectors $\boldsymbol{\Theta}_1$ and $\boldsymbol{\Theta}_2$ respectively. Assuming that we have equal preferences of these models, their prior probabilities are $\pi(\mathcal{M}_1) = \pi(\mathcal{M}_2) = 1/2$. In this case, the ratio of the models' posterior probabilities, $P(\mathcal{M}_1|\mathbf{Y})$ and $P(\mathcal{M}_2|\mathbf{Y})$, is equal to the Bayes factor. The latter is defined as the ratio of the models' marginal likelihoods (see Kass and Raftery, 1995). Thus, we have

$$\frac{P(\mathcal{M}_2|\mathbf{Y})}{P(\mathcal{M}_1|\mathbf{Y})} = \frac{f(\mathcal{M}_2,\mathbf{Y})/f(\mathbf{Y})}{f(\mathcal{M}_1,\mathbf{Y})/f(\mathbf{Y})} = \frac{f(\mathbf{Y}|\mathcal{M}_2)\,\pi(\mathcal{M}_2)}{f(\mathbf{Y}|\mathcal{M}_1)\,\pi(\mathcal{M}_1)} = \frac{f(\mathbf{Y}|\mathcal{M}_2)}{f(\mathbf{Y}|\mathcal{M}_1)}, \qquad (11)$$

where $f(\mathcal{M}_1,\mathbf{Y})$ and $f(\mathcal{M}_2,\mathbf{Y})$ are the joint distributions of the models and the data, and $f(\mathbf{Y})$ is the unconditional distribution of the data. As in Malyshkina et al. (2009), to calculate the marginal likelihoods $f(\mathbf{Y}|\mathcal{M}_1)$ and $f(\mathbf{Y}|\mathcal{M}_2)$, we use the harmonic mean formula $f(\mathbf{Y}|\mathcal{M})^{-1} = E[f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})^{-1}\,|\,\mathbf{Y}]$, where $E(\ldots|\mathbf{Y})$ means posterior expectation calculated by using the posterior distribution. If the ratio in Eq. (11) is larger than one, then model $\mathcal{M}_2$ is favored; if the ratio is less than one, then model $\mathcal{M}_1$ is favored. An advantage of the use of Bayes factors is that it has an inherent penalty for including too many parameters in the model and guards against overfitting.

^6 Below we will have five time periods ($T = 5$) and 335 roadway segments ($N = 335$). In this case, there are $2^{TN} = 2^{1675}$ possible combinations for the value of vector $\mathbf{S} = [(s_{1,1}, \ldots, s_{T,1}), \ldots, (s_{1,N}, \ldots, s_{T,N})]'$.

^7 Our priors for $\alpha$, $\beta$-s, $p^{(n)}_{0\to1}$ and $p^{(n)}_{1\to0}$ are flat or nearly flat, while the prior for the states $\mathbf{S}$ reflects the Markov process property, specified by Eq. (1).
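The harmonic mean formula can be evaluated from posterior draws of the log-likelihood. Because log-likelihood values in the thousands (in absolute value) would underflow if exponentiated directly, the Python sketch below works in log space with a log-sum-exp step. The function names and the toy inputs are assumptions for the example.

```python
import math

def log_marginal_harmonic(log_liks):
    """Harmonic-mean estimate of ln f(Y|M) from posterior draws of the
    log-likelihood: ln f(Y|M) = ln G - ln(sum_g exp(-LL_g))."""
    neg = [-ll for ll in log_liks]
    m = max(neg)
    # log-sum-exp: subtract the maximum before exponentiating
    lse = m + math.log(sum(math.exp(v - m) for v in neg))
    return math.log(len(log_liks)) - lse

def log_bayes_factor(log_liks_2, log_liks_1):
    """Log of the ratio f(Y|M2) / f(Y|M1); positive values favor model 2."""
    return log_marginal_harmonic(log_liks_2) - log_marginal_harmonic(log_liks_1)

if __name__ == "__main__":
    # Made-up posterior log-likelihood draws for two hypothetical models.
    draws_1 = [-2510.0, -2512.5, -2509.0, -2511.0]
    draws_2 = [-2190.0, -2188.5, -2191.0, -2189.0]
    print(log_bayes_factor(draws_2, draws_1))
```

The harmonic mean is dominated by the smallest likelihood values in the posterior sample, which is one reason interval estimates (for example, by bootstrap) are useful alongside the point estimate.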



To evaluate the performance of model $\{\mathcal{M}, \boldsymbol{\Theta}\}$ in fitting the observed data $\mathbf{Y}$, we carry out a $\chi^2$ goodness-of-fit test ([...], 1998; Wood, 2002; Press et al., 2007), using simulations to find the distribution of the $\chi^2$ quantity, which measures the discrepancy between the observations and the model predictions (Cowan, 1998). This distribution is then used to find the goodness-of-fit p-value, which is the probability that $\chi^2$ exceeds the observed value of $\chi^2$ under the hypothesis that the model is true (the observed value of $\chi^2$ is calculated by using the observed data $\mathbf{Y}$). For additional details, please see Malyshkina (2008).
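The simulation-based p-value can be sketched in Python as follows. The exact discrepancy measure used in this study is given in Malyshkina (2008); the Pearson-type statistic below is an assumption made for illustration, and both function names are hypothetical.

```python
def pearson_chi2(observed, means, variances):
    """Pearson-type discrepancy between counts and model predictions
    (assumed form: squared deviations scaled by the model variances)."""
    return sum((o - m) ** 2 / v for o, m, v in zip(observed, means, variances))

def mc_p_value(chi2_observed, chi2_simulated):
    """Fraction of simulated discrepancies exceeding the observed one,
    where the simulated data sets are generated under the fitted model."""
    exceed = sum(1 for c in chi2_simulated if c > chi2_observed)
    return exceed / len(chi2_simulated)

if __name__ == "__main__":
    # Made-up numbers: one observed statistic and four simulated ones.
    print(mc_p_value(2.0, [1.0, 2.0, 3.0, 4.0]))
```

A small p-value means that data simulated from the model rarely look as discrepant as the observed data, which signals a poor fit.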



4 Empirical results 



Data are used from 5769 accidents that were observed on 335 interstate highway segments in Indiana in 1995-1999. We use annual time periods, $t = 1, 2, 3, 4, T = 5$ in total.^8 Thus, for each roadway segment $n = 1, 2, \ldots, N = 335$ the state $s_{t,n}$ can change every year. Four types of accident frequency models are estimated:

(1) First, for the purpose of explanatory variable selection, we estimate an auxiliary standard negative binomial (NB) model, which is not reported here. We estimate this model by maximum likelihood estimation (MLE). To obtain a standard NB model, we choose explanatory variables and their dummies by using the Akaike Information Criterion (AIC)^9 and the 5% statistical significance level for the two-tailed t-test (for details on our variable selection methods, see Malyshkina, 2006). In order to make a comparison of explanatory variable effects in different models straightforward, in all other models, described below, we use only those explanatory variables that enter the standard NB model.^10

^8 We also considered quarterly time periods and obtained qualitatively similar results (not reported here).



(2) We estimate the standard ZINB-$\tau$ model, specified by Eqs. (7) and (8). First, we estimate this model by maximum likelihood estimation (MLE) and use the 5% statistical significance level for evaluation of the statistical significance of each $\beta$-parameter. Second, we estimate the same ZINB-$\tau$ model by the Bayesian inference approach and MCMC simulations. As one expects, the Bayesian-MCMC estimation results turned out to be similar to the MLE estimation results for the ZINB-$\tau$ model.

(3) We estimate the standard ZINB-$\gamma$ model, specified by Eqs. (7) and (9). First, we estimate this model by MLE and use the 5% statistical significance level for evaluation of the statistical significance of each $\beta$-parameter. Second, we estimate the same ZINB-$\gamma$ model by the Bayesian inference approach and MCMC simulations. The Bayesian-MCMC and the MLE estimation results for the ZINB-$\gamma$ model turned out to be similar.

(4) We estimate the two-state Markov switching negative binomial (MSNB) model, specified by Eqs. (1)-(6), by the Bayesian-MCMC methods. We consecutively construct and use 60%, 85% and 95% Bayesian credible intervals for evaluation of the statistical significance of each $\beta$-parameter in the MSNB model. As a result, in the final MSNB model some components of $\boldsymbol{\beta}$ are restricted to zero.^11 No restriction is imposed on the over-dispersion parameter $\alpha$, which turns out to be significant anyway.

The model estimation results for accident frequencies are given in Table 1. Continuous model parameters, $\beta$-s and $\alpha$, are given together with their 95% confidence intervals (if MLE) or 95% credible intervals (if Bayesian-MCMC); refer to the superscript and subscript numbers adjacent to parameter estimates in Table 1.^12 Table 2 gives summary statistics of all roadway segment characteristic variables $\mathbf{X}_{t,n}$ (except the intercept).

The estimation results show that the MSNB model is strongly favored by the empirical data, as compared to the standard ZINB models. Indeed, from Table 1 we see that the MSNB model provides considerable, 335.69 and 263.12, improvements of the logarithm of the marginal likelihood of the data as compared to the ZINB-$\tau$ and ZINB-$\gamma$ models.^13 Thus, from Eq. (11), we find that, given the accident data, the posterior probability of the MSNB model is larger than the probabilities of the ZINB-$\tau$ and ZINB-$\gamma$ models by factors of $e^{335.69}$ and $e^{263.12}$, respectively.^14

Let us now consider the maximum likelihood estimation (MLE) of the standard ZINB-$\tau$ and ZINB-$\gamma$ models and an imaginary MLE estimation of the MSNB model. Referring to Table 1, the MLE gave maximum log-likelihood values -2502.67 and -2426.54 for the ZINB-$\tau$ and ZINB-$\gamma$ models. The maximum log-likelihood value observed during our MCMC simulations for the MSNB model is equal to -2049.45. An imaginary MLE, at its convergence, would give an MSNB log-likelihood value that would be even larger than this observed value. Therefore, the MSNB model, if estimated by the MLE, would provide very large, at least 453.22 and 377.09, improvements in the maximum log-likelihood value over the ZINB-$\tau$ and ZINB-$\gamma$ models. These improvements would come with no increase or a decrease in the number of free continuous model parameters ($\beta$-s, $\alpha$, $\tau$, $\gamma$-s) that enter the likelihood function.

^9 Minimization of AIC = 2K - 2LL, where K is the number of free continuous model parameters and LL is the log-likelihood, ensures an optimal choice of explanatory variables in a model and avoids overfitting (Tsay, 2002; Washington et al., 2003).

^10 A formal Bayesian approach to model variable selection is based on evaluation of the model's marginal likelihood and the Bayes factor (11). Unfortunately, because MCMC simulations are computationally expensive, evaluation of marginal likelihoods for a large number of trial models is not feasible in our study.

^11 A $\beta$-parameter is restricted to zero if it is statistically insignificant. A 1 - a credible interval is chosen in such a way that the posterior probabilities of being below and above it are both equal to a/2 (we use significance levels a = 40%, 15%, 5%).

^12 Note that MLE assumes asymptotic normality of the estimates, resulting in confidence intervals being symmetric around the means (a 95% confidence interval is ±1.96 standard deviations around the mean). In contrast, Bayesian estimation does not require this assumption, and posterior distributions of parameters and Bayesian credible intervals are usually non-symmetric.

^13 We use the harmonic mean formula to calculate the values and the 95% confidence intervals of the log-marginal-likelihoods given in Table 1. The confidence intervals are calculated by bootstrap simulations. For details, see Malyshkina et al. (2009) or Malyshkina (2008).

^14 There are other frequently used model comparison criteria, for example, the deviance information criterion, $\mathrm{DIC} = 2E[D(\boldsymbol{\Theta})|\mathbf{Y}] - D(E[\boldsymbol{\Theta}|\mathbf{Y}])$, where deviance $D(\boldsymbol{\Theta}) = -2\ln[f(\mathbf{Y}|\boldsymbol{\Theta},\mathcal{M})]$ (Robert, 2001). Models with smaller DIC are favored to models with larger DIC. We find DIC values 5037.3, 4891.4, 4261.5 for the ZINB-$\tau$, ZINB-$\gamma$ and MSNB models respectively. This means that the MSNB model is favored over the standard ZINB models. However, DIC is theoretically based on the assumption of asymptotic multivariate normality of the posterior distribution, in which case DIC reduces to AIC (Spiegelhalter et al., 2002). As a result, we prefer to rely on a mathematically rigorous and formal Bayes factor approach to model selection, as given by Eq. (11).
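The comparisons reported in this section reduce to simple arithmetic, which the following Python sketch reproduces. It uses the maximum log-likelihood values and log-marginal-likelihood improvements quoted in the text and the numbers of free parameters (16 and 19) listed in Table 1 for the ZINB models; the dictionary keys and function names are hypothetical.

```python
import math

# Values quoted in the text and in Table 1.
max_ll = {"ZINB-tau": -2502.67, "ZINB-gamma": -2426.54, "MSNB": -2049.45}
n_params = {"ZINB-tau": 16, "ZINB-gamma": 19}
log_bayes_factor = {"ZINB-tau": 335.69, "ZINB-gamma": 263.12}

def ll_gain(model):
    """Improvement of the MSNB maximum log-likelihood over a ZINB model."""
    return max_ll["MSNB"] - max_ll[model]

def aic(model):
    """AIC = 2K - 2LL, evaluated at the MLE of a ZINB model."""
    return 2 * n_params[model] - 2 * max_ll[model]

def log10_bayes_factor(model):
    """Bayes factor in favor of MSNB, expressed as a power of ten."""
    return log_bayes_factor[model] / math.log(10)

if __name__ == "__main__":
    for name in ("ZINB-tau", "ZINB-gamma"):
        print(name, round(ll_gain(name), 2), round(aic(name), 2),
              round(log10_bayes_factor(name), 1))
```

The AIC values obtained this way for the two ZINB models are close to the DIC values of 5037.3 and 4891.4 reported in the text, consistent with the remark that DIC reduces to AIC under asymptotic normality of the posterior.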
Table 1



Estimation results for models of accident frequency (the superscript and subscript numbers to the right of individual 
parameter estimates are 95% confidence/credible intervals - see text for further explanation) 



Variable 


ZINB-x'^ 


ZINB-T fc' 


MSNB'^ 
by MCMC 


by MLE 


by MCMC 


by MLE 


by MCMC 


/3- and a-parameters in Eq. ^ 


Intercept (constant term) 


-15.0li?j 


10.Z_jy 4 


11-0-14.8 


-11 6-*-2^ 


-i7.3i«:° 


Accident occurring on interstates 1-70 or 1-164 (dummy) 


-.683_-^l° 


-.685_[^ll 


71c:-. 602 
•'J^3_.829 


71 t:-.593 
■ '^^-.836 




Pavement quality index (PQl) average'* 


- 0122- °^*'^ 


r,i 99-. 00562 
.U1ZZ_ Q-^gg 


- 0140-"°''2'' 
.UltU_ 0217 


- 0143- 

■^^^'-'-.0221 


- 0163- ™**50 
.U1U0_ 0240 


Logarithm of road segment length (in miles) 


701 -832 


701 .829 
./Mi 754 


090.978 
■^^^.880 


qoq.993 
■^•^='.886 


007.929 
•°°'.845 


Number of ramps on the viewing side per lane per mile 


•226:?°° 


997. 306 

.149 


208-387 
.Z30 209 


004.394 
■■"^^.214 


0-1 7.404 
■''^ ' .230 


Number of lanes on a roadway 










l-iy.386 


Median configuration is depressed (dummy) 




1Q0.282 
■-^"'-'.0839 


201-319 
■^'-'^.0820 


202-325 
•^"^.0781 




Median barrier presence (dummy) 


J^-"t ■3-1.64 


-1 43-i-i'l 






-1 69-^°" 

i.DM_2 4g 


Width of the interior shoulder is less that 5 feet (dummy) 


Q0O.443 
.ozo 202 


090.434 
.OZO 211 


.435:^2 


407.569 
■^•3 '.307 


074.505 
•■3'^. 243 


Outside shoulder width (in feet) 


-.o48o::«i^^ 


n47S--0207 
-.U478_ Q74g 


— .0532_ (jgg7 


— n^'^2— -^^20 

.UOOZ_ Qgg7 


- 0537-'''2*'' 
.UOO(_ ogg2 


Outside barrier is absent (dummy) 






-■2451:^^3" 


■^^'-'-.389 


- 264" ■^^*' 
■^"^-.403 


Average annual daily traffic (AADT) 


X 10-5 


—4 14-3.31 
X 10-5 


-i-93:6:m 

X 10-5 


_1 01 -3.16 
^■^^-5.83 

X 10-5 


-3.78:2:02 

X 10-5 


Logarithm of average annual daily traffic 








1 ^2l-86 
1-''^1.15 


-1 052.34 


Number of bridges per mile 










-02141-°^ 


Maximum of reciprocal values of horizontal curve radii (in 1/milc) 


- 140--°^^° 


_ ;^4i-.0734 
■^^^-.208 


1 q^-.0559 
■1^ 213 


1 00-. 0593 
-.13»_ 217 




Percentage of single unit trucks (daily average) 


1 nol.84 
^■'^''.624 


.'-■^'3.646 


1 091.96 
^•■-'^.693 


1 091.96 
1-3^.691 


1-^^.688 


Number of changes per vertical profile along a roadway segment 


nc;Kc:.0930 
■^^''''.0180 


fic:f;9.0903 
.UODZ 0226 








Over-dispersion parameter a in NB models 


144. 183 
105 




ion.l68 
■^•^".0925 


149. 185 
■1^^.105 


-114-147 
■-^-^^.0847 



Table 1 



(Continued) 



Variable 


ZINB-1-'^ 


ZINB-t'' 


MSNB 
by MCMC 


by MLE 


by MCMC 


by MLE 


by MCMC 


T- and 7-parameters in Eqs. JSj and ^ 


The model parameter r in Eq. ||8ll 


^ -7r»— 1.45 

-l-72_2.oo 


1 rrn- 1.50 

"l-73_i.g8 








Intercept (constant term) 






no i41.3 
^•-'■^4.99 


2fi 547.0 
^"■^10. 9 


- 


Logarithm of road segment length (in miles) 






1 .1 — 942 


1 , — 1 03 
^•^-1.83 




Median barrier presence (dummy) 








A T ^ 5 90 

4.i6|:i? 




Average annual daily traffic (AADT) 


- 


- 


q oqIS.I 
^•^■^3.35 

X 10-5 


in 51^-* 

iU.Og 72 

X 10-5 


- 


Logarithm of average annual daily traffic 


- 


- 


o 00-. 901 
-^■»»_4.g6 


-3.281^:59 


- 


Mean accident rate {Xt,n for NB), averaged over all values of Xt.n 




3.38 




3.42 


3.88 


Standard deviation of accident rate (•\/At,n(l + ciXt^n) for NB), 
averaged over all values of explanatory variables Xt_„ 




2.14 




2.15 


2.13 


Total number of free model parameters (/3-s, 7-s, a and r) 


16 


16 


19 


19 


16 


Posterior average of the log-likelihood (LL) 




-2510.68:^5;6.i3 




— 2436 '34—2431.12 




Max(LL): estimated max. value of log-likelihood (LL) for MLE; 
maximum observed value of LL for Bayesian-MCMC 


-2502.67 

(MLE) 


-2503.21 

(observed) 


-2426.54 
(MLE) 


-2427.41 

(observed) 


-2049.45 

(observed) 


Logarithm of marginal likelihood of data (ln[/(Y| A^)]) 




— 251 Q on— 2516.95 




-2447 33-2443.93 
^^^'••^'^-2448.86 


91 S4 91-2186.70 
-2i84.2i_2i69.56 


Goodness-of-fit p-value 




0.005 




0.177 


0.191 


Maximum of the potential scale reduction factors (PSRF) 




1.01006 




1.02200 


1.02117 


Multivariate potential scale reduction factor (MPSRF) ° 




1.01023 




1.02302 


1.02189 



^a Standard (conventional) ZINB-τ model estimated by maximum likelihood estimation (MLE) and Markov Chain Monte Carlo (MCMC) simulations.
^b Standard ZINB-γ model estimated by maximum likelihood estimation (MLE) and Markov Chain Monte Carlo (MCMC) simulations.
^c Two-state Markov switching negative binomial (MSNB) model where all reported parameters are for the normal-count state s = 1.
^d The pavement quality index (PQI) is a composite measure of overall pavement quality evaluated on a 0 to 100 scale.
^e PSRF/MPSRF are calculated separately/jointly for all continuous model parameters. PSRF and MPSRF are close to 1 for converged MCMC chains.



Table 2

Summary statistics of roadway segment characteristic variables

Variable                                                            Mean          Standard deviation  Minimum       Median        Maximum
Accident occurring on interstates I-70 or I-164 (dummy)             .155          .363                0             0             1.00
Pavement quality index (PQI) average^a                              88.6          5.96                69.0          90.3          98.5
Logarithm of road segment length (in miles)                         -.901         1.22                -4.71         -1.03         2.44
Number of ramps on the viewing side per lane per mile               .138          .408                0             0             3.27
Number of lanes on a roadway                                        2.09          .286                2.00          2.00          3.00
Median configuration is depressed (dummy)                           .630          .484                0             1.00          1.00
Median barrier presence (dummy)                                     .161          .368                0             0             1
Width of the interior shoulder is less than 5 feet (dummy)          .696          .461                0             1.00          1.00
Outside shoulder width (in feet)                                    11.3          1.74                6.20          11.2          21.8
Outside barrier absence (dummy)                                     .830          .376                0             1.00          1.00
Average annual daily traffic (AADT)                                 3.03 × 10^4   2.89 × 10^4         .944 × 10^4   1.65 × 10^4   14.3 × 10^4
Logarithm of average annual daily traffic                           10.0          .623                9.15          9.71          11.9
Number of bridges per mile                                          1.76          8.14                0             0             124
Maximum of reciprocal values of horizontal curve radii (in 1/mile)  .650          .632                0             .589          2.26
Percentage of single unit trucks (daily average)                    .0859         .0678               .00975        .0683         .322
Number of changes per vertical profile along a roadway segment      .522          .908                0             0             6.00

^a The pavement quality index (PQI) is a composite measure of overall pavement quality evaluated on a 0 to 100 scale.



To evaluate the goodness-of-fit for a model, we use the posterior (or MLE)
estimates of all continuous model parameters ($\beta$-s, $\alpha$,
$p^{(n)}_{0\to 1}$, $p^{(n)}_{1\to 0}$) and generate a large number of
artificial data sets under the hypothesis that the model is true. We find
the distribution of $\chi^2$ and calculate the goodness-of-fit p-value for the
observed value of $\chi^2$. For details, see (Malyshkina et al., 2009). The
resulting p-values for our models are given in Table 1. For the ZINB-γ and
MSNB models the p-values are sufficiently large, around 20%, which indicates
that these models fit the data reasonably well. At the same time, for the
ZINB-τ model the goodness-of-fit p-value is only around 0.5%, which indicates
a much poorer fit.
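
The simulation-based p-value described above can be sketched as follows. This is a minimal illustration, not the procedure of Malyshkina et al. (2009): it assumes a Pearson-type $\chi^2$ statistic and a plain negative binomial model with known rate and over-dispersion, and all names and numerical values (`nb_draw`, `pearson_chi2`, `gof_p_value`, the rate 2.0, the over-dispersion 0.5) are hypothetical.

```python
import numpy as np

def nb_draw(rng, lam, alpha):
    """Draw NB counts with mean lam and variance lam + alpha * lam**2."""
    r = 1.0 / alpha
    return rng.negative_binomial(r, r / (r + lam))

def pearson_chi2(counts, lam, alpha):
    """Pearson-type statistic: squared residuals scaled by the model variance."""
    return float(np.sum((counts - lam) ** 2 / (lam + alpha * lam ** 2)))

def gof_p_value(observed, lam, alpha, n_sets=2000, seed=0):
    """Share of artificial data sets whose chi2 is at least the observed chi2."""
    rng = np.random.default_rng(seed)
    chi2_obs = pearson_chi2(observed, lam, alpha)
    chi2_sim = np.array([pearson_chi2(nb_draw(rng, lam, alpha), lam, alpha)
                         for _ in range(n_sets)])
    return float(np.mean(chi2_sim >= chi2_obs))

# Hypothetical setting: 300 segments, rate 2.0, over-dispersion 0.5.
lam = np.full(300, 2.0)
alpha = 0.5
rng = np.random.default_rng(1)

# Data drawn from the assumed model itself.
well_specified = nb_draw(rng, lam, alpha)
# Data the assumed model does not describe: half zeros, half high counts.
zero_heavy = np.where(rng.random(300) < 0.5, 0, nb_draw(rng, 8.0 * lam, alpha))

p_good = gof_p_value(well_specified, lam, alpha)
p_bad = gof_p_value(zero_heavy, lam, alpha)
```

A small p-value, as for `zero_heavy` here, means that data sets generated under the assumed model almost never look as extreme as the observed one.
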
The estimation results also show that the over-dispersion parameter $\alpha$
is higher for the ZINB-τ and ZINB-γ models than for the MSNB model (refer to
Table 1). This suggests that the over-dispersed volatility of accident
frequencies, which is often observed in empirical data, could be due in part
to latent switching between the states of roadway safety.
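
A toy moment calculation illustrates why unmodeled state switching can inflate the apparent over-dispersion. It assumes, purely for illustration, that each observation is independently in the normal-count state with probability $p$ (ignoring the Markov time-dependence and all covariates) and that counts in that state are negative binomial with mean $\lambda$ and variance $\lambda + \alpha\lambda^2$. Under these assumptions the mixture has mean $p\lambda$ and variance $p\lambda + p\lambda^2(\alpha + 1 - p)$, so a moment-based over-dispersion estimate gives $(\alpha + 1 - p)/p \geq \alpha$. The function names and numbers below are hypothetical.

```python
import numpy as np

def apparent_alpha(alpha, p_normal):
    """Moment-based over-dispersion of the zero-state / NB mixture."""
    return (alpha + 1.0 - p_normal) / p_normal

def simulated_alpha(lam, alpha, p_normal, size=200_000, seed=0):
    """Estimate (variance - mean) / mean**2 from simulated mixture counts."""
    rng = np.random.default_rng(seed)
    r = 1.0 / alpha
    counts = rng.negative_binomial(r, r / (r + lam), size=size)
    # Observations in the zero-accident state are forced to zero.
    counts = np.where(rng.random(size) < p_normal, counts, 0)
    mean, var = counts.mean(), counts.var()
    return float((var - mean) / mean ** 2)

closed_form = apparent_alpha(0.3, 0.6)        # (0.3 + 0.4) / 0.6
simulated = simulated_alpha(3.0, 0.3, 0.6)    # should be close to closed_form
```

With $p = 1$ the expression reduces to $\alpha$, so the inflation in this sketch comes entirely from the zero-accident state.
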

Now, refer to Figure 2, made for the case of the MSNB model. The four
plots in this figure show five-year time series of the posterior probabilities
$P(s_{t,n}=1|\mathbf{Y})$ of the normal-count state for four selected roadway
segments. These plots represent the following four categories of roadway
segments:

(1) For roadway segments from the first category we have
    $P(s_{t,n}=1|\mathbf{Y}) = 1$ for all $t = 1,2,3,4,5$. Thus, we can say
    with absolute certainty that these segments were always in the
    normal-count state $s_{t,n} = 1$ during the considered five-year time
    interval. A roadway segment belongs to this category if and only if it
    had at least one accident during each year ($t = 1,2,3,4,5$). An example
    of such a roadway segment is given in the top-left plot in Figure 2. For
    this segment the posterior expectation of the long-term unconditional
    probability $\bar{p}_1$ of being in the normal-count state is large,
    $E(\bar{p}_1|\mathbf{Y}) = 0.750$.

(2) For roadway segments from the second category
    $P(s_{t,n}=1|\mathbf{Y}) \ll 1$ for all $t = 1,2,3,4,5$. Thus, we can say
    with a high degree of certainty that these segments were always in the
    zero-accident state $s_{t,n} = 0$ during the considered five-year time
    interval. A roadway segment $n$ belongs to this category if it had no
    accidents observed over the five-year interval even though the accident
    rates given by Eq. (5) were large, $\lambda_{t,n} \gg 1$ for all
    $t = 1,2,3,4,5$. Clearly, this segment would be unlikely to have zero
    accidents



Note that the state values $\mathbf{S}$ are generated by using
$p^{(n)}_{0\to 1}$ and $p^{(n)}_{1\to 0}$.

It is worth mentioning that for the auxiliary standard negative binomial (NB)
model, which we do not report here, the goodness-of-fit p-value was also very
poor, $\approx 0.3\%$. This is an expected result because of the
preponderance of zeros in the data, which the NB model does not account for.
[Figure 2: four panels plotting $P(s_{t,n}=1|\mathbf{Y})$ against Date (1995 to 1999). Legible panel titles: segment #1, $E(\bar{p}_1|\mathbf{Y}) = 0.750$; segment #54, $E(\bar{p}_1|\mathbf{Y}) = 0.260$.]

Fig. 2. Five-year time series of the posterior probabilities
$P(s_{t,n}=1|\mathbf{Y})$ of the normal-count state $s_{t,n} = 1$ for four
selected roadway segments ($t = 1,2,3,4,5$).

    observed, if it were not in the zero-accident state all the time. An
    example of such a roadway segment is given in the top-right plot in
    Figure 2. For this segment $E(\bar{p}_1|\mathbf{Y}) = 0.260$ is small.
(3) For roadway segments from the third category $P(s_{t,n}=1|\mathbf{Y})$ is
    neither one nor close to zero for all $t = 1,2,3,4,5$. For these segments
    we cannot determine with high certainty what states they were in during
    years $t = 1,2,3,4,5$. A roadway segment $n$ belongs to this category if
    it had no accidents observed over the considered five-year time interval
    and the accident rates were not large, $\lambda_{t,n} \lesssim 1$ for all
    $t = 1,2,3,4,5$. In fact, when $\lambda_{t,n} \ll 1$, the posterior
    probabilities of the two states are close to one-half,
    $P(s_{t,n}=1|\mathbf{Y}) \approx P(s_{t,n}=0|\mathbf{Y}) \approx 0.5$,
    and no inference about the value of the state variable $s_{t,n}$ can be
    made. In this case of small accident frequencies, the observation of zero
    accidents is perfectly consistent with both states $s_{t,n} = 0$ and
    $s_{t,n} = 1$. An example



Note that the zero-accident state may exist due to under-reporting of minor,
low-severity accidents (Shankar et al., 1997).



If there were no Markov switching, which introduces time-dependence of states
via Eqs. (1), then, assuming non-informative priors
$\pi(s_{t,n}=0) = \pi(s_{t,n}=1) = 1/2$ for states $s_{t,n}$, the posterior
probabilities $P(s_{t,n}=1|\mathbf{Y})$ would be either exactly equal to 1
(when $A_{t,n} > 0$) or necessarily below 1/2 (when $A_{t,n} = 0$). In other
words, we would have $P(s_{t,n}=1|\mathbf{Y}) \notin [0.5, 1)$ for any $t$
and $n$. Even with Markov switching present, in this study we never found any
$P(s_{t,n}=1|\mathbf{Y})$ close but not equal to 1; refer to the top plot in
Figure 3.
[Figure 3: two histograms. Horizontal axes: $P(s_{t,n}=1|\mathbf{Y})$ (top plot) and $E(\bar{p}_1^{(n)}|\mathbf{Y})$ (bottom plot), each running from 0 to 1.]

Fig. 3. Histograms of the posterior probabilities $P(s_{t,n}=1|\mathbf{Y})$
(the top plot) and of the posterior expectations
$E[\bar{p}_1^{(n)}|\mathbf{Y}]$ (the bottom plot). Here $t = 1,2,3,4,5$ and
$n = 1,2,\ldots,335$.



    of a roadway segment from the third category is given in the bottom-left
    plot in Figure 2. For this segment $E(\bar{p}_1|\mathbf{Y}) = 0.496$ is
    about one-half.
(4) Finally, the fourth category is a mixture of the three categories
    described above. Roadway segments from this fourth category have
    posterior probabilities $P(s_{t,n}=1|\mathbf{Y})$ that change in time
    between the three possibilities given above. In particular, for some
    roadway segments we can say with high certainty that they changed their
    states in time from the zero-accident state $s_{t,n} = 0$ to the
    normal-count state $s_{t,n} = 1$ or vice versa. An example of a roadway
    segment from the fourth category is given in the bottom-right plot in
    Figure 2. For this segment $E(\bar{p}_1|\mathbf{Y}) = 0.510$ is about
    one-half. Thus we find direct empirical evidence that some roadway
    segments do change their states over time.
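
The first three categories can be reproduced qualitatively with a small forward-backward calculation for a single segment. This is a sketch under stated assumptions, not the paper's estimation method: the paper obtains $P(s_{t,n}=1|\mathbf{Y})$ by Bayesian MCMC, whereas the code below conditions on fixed, hypothetical parameter values (rates, over-dispersion 0.5, transition probabilities 0.3) and starts the chain from its stationary distribution. The function names are illustrative.

```python
import math
import numpy as np

def nb_pmf(a, lam, alpha):
    """NB probability of count a, with mean lam and over-dispersion alpha."""
    r = 1.0 / alpha
    log_p = (math.lgamma(a + r) - math.lgamma(r) - math.lgamma(a + 1)
             + r * math.log(r / (r + lam)) + a * math.log(lam / (r + lam)))
    return math.exp(log_p)

def smoothed_state_probs(counts, lams, alpha, p01, p10):
    """P(s_t = 1 | all counts) for one segment, parameters held fixed."""
    T = len(counts)
    trans = np.array([[1.0 - p01, p01], [p10, 1.0 - p10]])
    start = np.array([p10, p01]) / (p01 + p10)  # stationary distribution
    # State 0 allows only zero counts; state 1 is negative binomial.
    emit = np.array([[1.0 if a == 0 else 0.0, nb_pmf(a, lam, alpha)]
                     for a, lam in zip(counts, lams)])
    fwd = np.zeros((T, 2))
    bwd = np.ones((T, 2))
    fwd[0] = start * emit[0]
    fwd[0] /= fwd[0].sum()
    for t in range(1, T):                      # forward pass
        fwd[t] = (fwd[t - 1] @ trans) * emit[t]
        fwd[t] /= fwd[t].sum()
    for t in range(T - 2, -1, -1):             # backward pass
        bwd[t] = trans @ (emit[t + 1] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    post = fwd * bwd
    post /= post.sum(axis=1, keepdims=True)
    return post[:, 1]

# Category 1: at least one accident every year.
always_hit = smoothed_state_probs([3, 1, 2, 4, 1], [2.0] * 5, 0.5, 0.3, 0.3)
# Category 2: no accidents despite large accident rates.
zeros_high_rate = smoothed_state_probs([0] * 5, [10.0] * 5, 0.5, 0.3, 0.3)
# Category 3: no accidents and very small accident rates.
zeros_low_rate = smoothed_state_probs([0] * 5, [0.01] * 5, 0.5, 0.3, 0.3)
# Without time-dependence (both transition probabilities equal to 1/2).
no_switching = smoothed_state_probs([0, 2, 0], [1.0] * 3, 0.5, 0.5, 0.5)
```

In this sketch `always_hit` is 1 in every year, `zeros_high_rate` stays near 0, and `zeros_low_rate` stays near one-half. Setting both transition probabilities to 1/2 removes the time-dependence, and the zero-count years in `no_switching` then fall below 1/2 while the year with accidents equals 1.
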

Next, it is useful to consider roadway segment statistics by state of roadway
safety. Refer to Figure 3, made for the case of the MSNB model. The top plot
in this figure shows the histogram of the posterior probabilities
$P(s_{t,n}=1|\mathbf{Y})$ for all $N = 335$ roadway segments during all
$T = 5$ years (1675 values of $s_{t,n}$ in total). For example, we find that
during five years roadway segments had $P(s_{t,n}=1|\mathbf{Y}) = 1$ and were
normal-count in 851 cases, and they had $P(s_{t,n}=1|\mathbf{Y}) < 0.2$ and
were likely to be zero-accident in 212 cases. The bottom plot in Figure 3
shows the histogram of the posterior expectations
$E[\bar{p}_1^{(n)}|\mathbf{Y}]$, where
$\bar{p}_1^{(n)} = p^{(n)}_{0\to 1} / (p^{(n)}_{0\to 1} + p^{(n)}_{1\to 0})$
are the stationary unconditional probabilities of the normal-count state
(defined in the model specification). We find that
$0.2 < E[\bar{p}_1^{(n)}|\mathbf{Y}] < 0.8$ for all segments
$n = 1,2,\ldots,335$. This means that in the long run, all roadway segments
have significant probabilities of visiting both the zero-accident and the
normal-count states.
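
As a quick numerical check of the stationary-probability formula, the sketch below compares $p_{0\to 1}/(p_{0\to 1} + p_{1\to 0})$ with the long-run behavior of the two-state transition matrix. The transition probabilities 0.3 and 0.1 are hypothetical values chosen for illustration, not estimates from the paper, and the function names are illustrative.

```python
import numpy as np

def stationary_normal_prob(p01, p10):
    """Stationary probability of the normal-count state (state 1)."""
    return p01 / (p01 + p10)

def long_run_normal_prob(p01, p10, n_steps=200):
    """Probability of state 1 after many steps, starting from state 0."""
    trans = np.array([[1.0 - p01, p01], [p10, 1.0 - p10]])
    return float(np.linalg.matrix_power(trans, n_steps)[0, 1])

p_formula = stationary_normal_prob(0.3, 0.1)   # 0.3 / 0.4
p_long_run = long_run_normal_prob(0.3, 0.1)    # converges to the same value
```

Whenever both transition probabilities lie strictly between 0 and 1, this stationary probability also lies strictly between 0 and 1, so in the long run the chain visits both states.
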

Finally, it is also worth mentioning that, in addition to negative binomial
models, we estimated Poisson models for the same accident data and obtained
similar results (Malyshkina, 2008). In particular, we found that a two-state
Markov switching Poisson (MSP) model, which has the Poisson likelihood
function instead of the NB likelihood function in the model's likelihood
equation, is strongly favored by the empirical data as compared to standard
zero-inflated Poisson models.



5 Conclusions 



A number of important observations can be made with regard to our empirical
findings. First, Markov switching count-data models provide a superior
statistical fit for accident frequencies relative to standard zero-inflated
models. Second, Markov switching models, which explicitly consider
transitions between the zero-accident state and the normal-count state over
time, permit a direct empirical estimation of what states roadway segments
are in at different time periods. In particular, we found evidence that some
roadway segments changed their states over time (see the bottom-right plot in
Figure 2). Third, note that the Markov switching models avoid the
theoretically implausible assumption that some roadway segments are always
zero-accident because, in these models, every segment has a non-zero
probability of being in the normal-count state. Indeed, the long-term
unconditional mean of the accident rate for the $n$th roadway segment is
equal to $\bar{p}_1^{(n)} \langle\lambda_{t,n}\rangle_t$, where
$\bar{p}_1^{(n)} = p^{(n)}_{0\to 1} / (p^{(n)}_{0\to 1} + p^{(n)}_{1\to 0})$
is the stationary probability of being in the normal-count state
$s_{t,n} = 1$ and $\langle\lambda_{t,n}\rangle_t$ is the time average of the
accident rate in the normal-count state [refer to Eq. (5)]. This long-term
mean is always above zero (see the bottom plot in Figure 3), even for
segments that were likely to be in the zero-accident state over the whole
observed five-year time interval. Finally, we conclude that two-state Markov
switching count-data models are likely to be a better alternative to
zero-inflated models for accounting for the excess of zeros observed in
accident-frequency data.